Pushdown sort aggregate metrics#4603
Conversation
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
...src/main/java/org/opensearch/sql/opensearch/planner/physical/SortAggregationMetricsRule.java
Outdated
Show resolved
Hide resolved
...arch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
Outdated
Show resolved
Hide resolved
| return newTraitSet; | ||
| } | ||
|
|
||
| public CalciteLogicalIndexScan pushDownSortAggregateMetrics(Sort sort) { |
There was a problem hiding this comment.
We can only do this operation when bucket_nullable=false, right? Any place to check this parameter?
There was a problem hiding this comment.
[nit] Will it be better to put these checks in one place if possible? Now the prerequisites are split both here and in aggAction.pushDownSortAggMetrics, although they are both aggregationBuilders and some checking seems overlapped.
There was a problem hiding this comment.
Cannot get the bucket_nullable information here. Aggregation has already pushed down. So AggPushDownAction.L97 and L126 would be better IMO.
There was a problem hiding this comment.
But I see you've already get the pushed aggregation builder here.
List<AggregationBuilder> aggregationBuilders =
pushDownContext.getAggPushDownAction().getAggregationBuilder().getLeft();
Or otherwise L290-310 could all be moved into aggAction.pushDownSortAggMetrics.
It's also fine to me if not making change since the logic is correct anyway.
There was a problem hiding this comment.
I see, but I don't want always to check CompositeValuesSourceBuilder.missingBucket() here since not all ValuesSourceBuilders are impacted by bucket_nullable, for example date histogram. So moving to here has to check types, sounds not worth to check types twice.
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
| return newTraitSet; | ||
| } | ||
|
|
||
| public CalciteLogicalIndexScan pushDownSortAggregateMetrics(Sort sort) { |
There was a problem hiding this comment.
[nit] Will it be better to put these checks in one place if possible? Now the prerequisites are split both here and in aggAction.pushDownSortAggMetrics, although they are both aggregationBuilders and some checking seems overlapped.
...arch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
Show resolved
Hide resolved
… buckets Signed-off-by: Lantao Jin <ltjin@amazon.com>
| // count agg optimized, get the path name from field names | ||
| path = fieldNames.get(collations.get(0).getFieldIndex()); | ||
| } else if (metric instanceof ValuesSourceAggregationBuilder.LeafOnly) { | ||
| path = metric.getName(); |
There was a problem hiding this comment.
[nit, non-blocking] Could this be combined with above branch? i.e. Does path = fieldNames.get(collations.get(0).getFieldIndex()); also works for metric instanceof ValuesSourceAggregationBuilder.LeafOnly?
| List<String> fieldNames, | ||
| CompositeAggregationBuilder composite) { | ||
| String path; | ||
| AggregationBuilder metric = composite.getSubAggregations().stream().findFirst().orElse(null); |
There was a problem hiding this comment.
Do we only support only 1 agg metric for this enhancement? And anywhere to check this?
There was a problem hiding this comment.
Do we only support only 1 agg metric for this enhancement? And anywhere to check this?
There was a problem hiding this comment.
good catching. added restricted checking.
Signed-off-by: Lantao Jin <ltjin@amazon.com>
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4603-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5ebed8465dc9d498602a25b802207d63ee5030c8
# Push it to GitHub
git push --set-upstream origin backport/backport-4603-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
* convert sort aggregate metrics to term sort Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix conflicts Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix conflicts2 Signed-off-by: Lantao Jin <ltjin@amazon.com> * Add more javadoc Signed-off-by: Lantao Jin <ltjin@amazon.com> * address comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * delete incorrect comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * convert composite agg to multi-terms agg for sort metrics on multiple buckets Signed-off-by: Lantao Jin <ltjin@amazon.com> * avoid the case of 'stat count, sum ... | sort count' Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> (cherry picked from commit 5ebed84)
* Pushdown sort aggregate metrics (#4603) * convert sort aggregate metrics to term sort Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix conflicts Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix conflicts2 Signed-off-by: Lantao Jin <ltjin@amazon.com> * Add more javadoc Signed-off-by: Lantao Jin <ltjin@amazon.com> * address comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * delete incorrect comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * convert composite agg to multi-terms agg for sort metrics on multiple buckets Signed-off-by: Lantao Jin <ltjin@amazon.com> * avoid the case of 'stat count, sum ... | sort count' Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> (cherry picked from commit 5ebed84) * fix conflicts Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com>
* default-main: (34 commits) Enhance dynamic source clause to support only metadata filters (opensearch-project#4554) Make nested alias type support referring to outer context (opensearch-project#4673) Update big5 ppl queries and check plans (opensearch-project#4668) Support push down sort after limit (opensearch-project#4657) Use table scan rowType in filter pushdown could fix rename issue (opensearch-project#4670) Fix: Support Alias Fields in MIN, MAX, FIRST, LAST, and TAKE Aggregations (opensearch-project#4621) Fix bin nested fields issue (opensearch-project#4606) Add `per_minute`, `per_hour`, `per_day` function support (opensearch-project#4531) Pushdown sort aggregate metrics (opensearch-project#4603) Followup: Change ComparableLinkedHashMap to compare Key than Value (opensearch-project#4648) Mitigate the CI failure caused by 500 Internal Server Error (opensearch-project#4646) Allow renaming group-by fields to existing field names (opensearch-project#4586) Publish internal modules separately for downstream reuse (opensearch-project#4484) Revert "Update grammar files and developer guide (opensearch-project#4301)" (opensearch-project#4643) Support Automatic Type Conversion for REX/SPATH/PARSE Command Extractions (opensearch-project#4599) Replace all dots in fields of table scan's PhysType (opensearch-project#4633) Return comparable LinkedHashMap in `valueForCalcite()` of ExprTupleValue (opensearch-project#4629) Refactor JsonExtractAllFunctionIT and MapConcatFunctionIT (opensearch-project#4623) Pushdown case function in aggregations as range queries (opensearch-project#4400) Update GEOIP function to support IP types as input (opensearch-project#4613) ... # Conflicts: # docs/user/ppl/functions/conversion.rst
* default-main: (34 commits) Enhance dynamic source clause to support only metadata filters (opensearch-project#4554) Make nested alias type support referring to outer context (opensearch-project#4673) Update big5 ppl queries and check plans (opensearch-project#4668) Support push down sort after limit (opensearch-project#4657) Use table scan rowType in filter pushdown could fix rename issue (opensearch-project#4670) Fix: Support Alias Fields in MIN, MAX, FIRST, LAST, and TAKE Aggregations (opensearch-project#4621) Fix bin nested fields issue (opensearch-project#4606) Add `per_minute`, `per_hour`, `per_day` function support (opensearch-project#4531) Pushdown sort aggregate metrics (opensearch-project#4603) Followup: Change ComparableLinkedHashMap to compare Key than Value (opensearch-project#4648) Mitigate the CI failure caused by 500 Internal Server Error (opensearch-project#4646) Allow renaming group-by fields to existing field names (opensearch-project#4586) Publish internal modules separately for downstream reuse (opensearch-project#4484) Revert "Update grammar files and developer guide (opensearch-project#4301)" (opensearch-project#4643) Support Automatic Type Conversion for REX/SPATH/PARSE Command Extractions (opensearch-project#4599) Replace all dots in fields of table scan's PhysType (opensearch-project#4633) Return comparable LinkedHashMap in `valueForCalcite()` of ExprTupleValue (opensearch-project#4629) Refactor JsonExtractAllFunctionIT and MapConcatFunctionIT (opensearch-project#4623) Pushdown case function in aggregations as range queries (opensearch-project#4400) Update GEOIP function to support IP types as input (opensearch-project#4613) ... Signed-off-by: Asif Bashar <asif.bashar@gmail.com>
* convert sort aggregate metrics to term sort Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix conflicts Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix conflicts2 Signed-off-by: Lantao Jin <ltjin@amazon.com> * Add more javadoc Signed-off-by: Lantao Jin <ltjin@amazon.com> * address comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * delete incorrect comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * convert composite agg to multi-terms agg for sort metrics on multiple buckets Signed-off-by: Lantao Jin <ltjin@amazon.com> * avoid the case of 'stat count, sum ... | sort count' Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com>
Description
Pushdown sort aggregate metrics:
SORT_AGG_METRICSafterAGGREGATIONin pushdown contextcompositebucket agg toterms/histo/date-histoagg if sort one metric onsinglebucketcompositebucket agg (all terms source) tomulti-termsagg if sort one metric onmultiplebucketsRelated Issues
Resolves #4282
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.